SQL: Auto union Iceberg and Redpanda topic by kbatuigas · Pull Request #575 · redpanda-data/cloud-docs

kbatuigas · 2026-05-04T22:59:07Z

Description

This pull request adds comprehensive documentation for Redpanda SQL's new support for Iceberg catalogs and bridge queries, enabling users to query both live Redpanda topics and their Iceberg-committed history. It introduces reference and how-to content for creating, altering, and dropping Iceberg catalogs, details the new USING CATALOG clause for Redpanda catalogs, and provides a step-by-step guide for querying Iceberg-enabled topics.

New SQL statement documentation:

Added reference pages for CREATE ICEBERG CATALOG, ALTER ICEBERG CATALOG, and DROP ICEBERG CATALOG, including syntax, options (covering authentication and TLS), and usage examples. [1] [2] [3]
Updated navigation to include the new Iceberg catalog statement references.

Enhancements to Redpanda catalog documentation:

Documented the new USING CATALOG clause for CREATE REDPANDA CATALOG, which links a Redpanda catalog to an Iceberg catalog for bridge queries.
Added the pandaproxy_url option, required when using USING CATALOG, and provided an example of creating a linked catalog. [1] [2]

How-to guide for querying Iceberg-enabled topics:

Added a step-by-step guide describing how to set up storage, Iceberg, and Redpanda catalogs, map topics as SQL tables, and run bridge queries that span live and historical data. The guide also explains prerequisites and links to related reference content.

Resolves https://github.com/redpanda-data/documentation-private/issues/
Review deadline: 18 May

Page previews

Redpanda SQL > Query Data > Query Iceberg topics
Reference > Redpanda SQL Reference > Statements
CREATE ICEBERG CATALOG
ALTER ICEBERG CATALOG
DROP ICEBERG CATALOG
CREATE REDPANDA CATALOG > Create catalog linked to Iceberg catalog

Checks

New feature
Content gap
Support Follow-up
Small fix (typos, links, copyedits, etc)

netlify · 2026-05-04T22:59:11Z

✅ Deploy Preview for rp-cloud ready!

| Name | Link |
| --- | --- |
| 🔨 Latest commit | `82d7ea9` |
| 🔍 Latest deploy log | https://app.netlify.com/projects/rp-cloud/deploys/6a10ff5a8b30aa0008446a7e |
| 😎 Deploy Preview | https://deploy-preview-575--rp-cloud.netlify.app |

coderabbitai · 2026-05-04T22:59:13Z

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 8fd7b831-7aae-40d8-9100-7551403f2474

kbatuigas · 2026-05-13T18:51:51Z

+    warehouse = 's3://lakehouse-data/',
+    auth_type = 'oauth2',
+    oauth2_client_id = '<client-id>',
+    oauth2_client_secret = '<client-secret>',


Would we tell users to create and reference secrets here in the same way we do for RP Cloud secrets such as for catalog credentials in the cluster config, like so? https://deploy-preview-575--rp-cloud.netlify.app/redpanda-cloud/manage/iceberg/use-iceberg-catalogs/#use-a-secret-in-cluster-configuration

Greketrotny · 2026-05-18T08:46:18Z

+
+// TODO: SME — confirm when REFRESH must be run on the linked Iceberg table. Source shows that if the Iceberg table's schema isn't refreshed, the query fails at planning time with: `Schema not found for Iceberg table '<table>'. Run: REFRESH <catalog>=><table>`. Confirm:
+//   - Is REFRESH required only the first time, or every time the Iceberg schema changes?
+//   - Is REFRESH required when new records are added to the Iceberg table (no schema change), or only on schema change?


REFRESH pertains to the schema/shape of the table only. The whole point of the bridge queries is to fetch all data without refreshing anything.

Greketrotny · 2026-05-18T08:48:51Z

+== Query live and historical records together
+
+// TODO: SME — confirm when REFRESH must be run on the linked Iceberg table. Source shows that if the Iceberg table's schema isn't refreshed, the query fails at planning time with: `Schema not found for Iceberg table '<table>'. Run: REFRESH <catalog>=><table>`. Confirm:
+//   - Is REFRESH required only the first time, or every time the Iceberg schema changes?


Currently, the advice is to run REFRESH after any shape change on either side, because that is the view Oxla holds and uses when preparing the query. However, I believe @Bixkog is working on running REFRESH automatically on all tables when the catalogs are created, so the tables will be visible immediately after catalog creation. A refresh is still needed after any subsequent shape change.
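
As a rough illustration of the rule described in this thread (shape changes need a REFRESH, new records do not), here is a minimal Python sketch. The `TableEvent` model, the event kinds, and the catalog and table names are hypothetical and exist only for illustration. The only part taken from the source is the statement shape `REFRESH <catalog>=><table>` quoted in the planning-time error message.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableEvent:
    """A change observed on a mapped table (hypothetical model, not an engine API)."""
    table: str
    kind: str  # "schema_change" or "new_records"


def refresh_statements(catalog: str, events: list[TableEvent]) -> list[str]:
    """Return one REFRESH statement per table whose shape changed.

    New records alone never produce a statement; only schema (shape)
    changes do, matching the guidance in this thread.
    """
    changed: list[str] = []
    for event in events:
        if event.kind == "schema_change" and event.table not in changed:
            changed.append(event.table)
    return [f"REFRESH {catalog}=>{table};" for table in changed]


# Example: only "orders" changed shape, so only "orders" needs a REFRESH.
events = [
    TableEvent("orders", "new_records"),
    TableEvent("orders", "schema_change"),
    TableEvent("payments", "new_records"),
]
statements = refresh_statements("my_iceberg_catalog", events)
print(statements)
```

If REFRESH later runs automatically when catalogs are created, as mentioned above, the first-time case goes away, but the schema-change case in this sketch would still apply.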

kbatuigas · 2026-05-19T05:09:38Z

+
+== Set up the Iceberg query catalogs
+
+You create three objects, in this order: a storage connection, an Iceberg catalog, and a Redpanda catalog that links to the Iceberg catalog. The storage and Iceberg-catalog options must match the cluster's xref:manage:iceberg/rest-catalog/index.adoc[REST catalog configuration] (endpoint, credentials, region).


@Greketrotny if you wanted to continue with default_redpanda_catalog, are you still required to execute CREATE STORAGE, and CREATE ICEBERG CATALOG? Or could you go straight to CREATE TABLE default_redpanda_catalog=>orders (and the storage connection and iceberg catalog are auto-linked under the hood) for an Iceberg-enabled topic?

Having an Iceberg-enabled topic implies that the CREATE STORAGE, CREATE ICEBERG CATALOG, and CREATE KAFKA CATALOG ... USING ... queries have already been invoked. So if the default_redpanda_catalog is provisioned that way (I assume it is when the Enable button next to the Oxla cluster is selected while creating a BYOC cluster), the user can jump right into creating tables and then running REFRESH, if it isn't run automatically.

@Greketrotny I reworked this a little so that instead of telling users to execute those CREATEs, the page instead explains that that's done for them: https://deploy-preview-575--rp-cloud.netlify.app/redpanda-cloud/sql/query-data/query-iceberg-topics/#how-the-query-catalogs-are-set-up let me know if this approach doesn't work. cc @mattschumpert

looks accurate

Feediver1

PR Review: SQL: Auto union Iceberg and Redpanda topic (#575)

Files reviewed: 6 .adoc files (466 additions / 17 deletions)
Overall assessment: The cleanest of the SQL GA series. Already engineer-APPROVED. Two broken xrefs are sibling-PR dependencies; same missing What's New entry; no style issues to flag. Ready to land as soon as #571 and #574 do.

What this PR does

Adds Redpanda SQL's Iceberg-bridge query support on the rp-sql integration branch:

modules/sql/pages/query-data/query-iceberg-topics.adoc (new, 123 lines) — how-to that explains the auto-provisioned storage/Iceberg/Redpanda-catalog chain, maps a topic as a SQL table, runs a query that spans live + Iceberg history, and covers schema-divergence rules.
modules/reference/pages/sql/sql-statements/create-iceberg-catalog.adoc (new, 210 lines) — reference for the new statement with full options (uri, warehouse, four auth types — none/OAuth2/basic/AWS SigV4, TLS settings) and five worked examples.
modules/reference/pages/sql/sql-statements/alter-iceberg-catalog.adoc (new, 40 lines) — modify connection properties of an existing Iceberg catalog.
modules/reference/pages/sql/sql-statements/drop-iceberg-catalog.adoc (new, 33 lines) — drop an Iceberg catalog, with the documented constraint that a linked Redpanda catalog must be detached or dropped first.
modules/reference/pages/sql/sql-statements/create-redpanda-catalog.adoc (56+ / 16−) — adds the new USING CATALOG clause, the conditional pandaproxy_url option, and a fourth example showing the linked-catalog form.
modules/ROOT/nav.adoc — wires the new pages into the nav tree.
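
To make the drop constraint in the list above concrete (an Iceberg catalog cannot be dropped while a Redpanda catalog is still linked to it), here is a toy Python model. It is not how Redpanda SQL is implemented. The class, method, and catalog names are all hypothetical, and the sketch only mirrors the documented ordering: detach or drop the linked Redpanda catalog first, then drop the Iceberg catalog.

```python
class CatalogLinkError(Exception):
    """Raised when a drop would orphan a linked Redpanda catalog (hypothetical)."""


class CatalogRegistry:
    """Toy in-memory model of Iceberg catalogs and the Redpanda catalogs linked to them."""

    def __init__(self) -> None:
        self.iceberg: set[str] = set()
        self.links: dict[str, str] = {}  # Redpanda catalog name -> Iceberg catalog name

    def create_iceberg(self, name: str) -> None:
        self.iceberg.add(name)

    def link_redpanda(self, redpanda: str, iceberg: str) -> None:
        # Models the USING CATALOG link: the Iceberg catalog must exist first.
        if iceberg not in self.iceberg:
            raise KeyError(f"unknown Iceberg catalog: {iceberg}")
        self.links[redpanda] = iceberg

    def detach_redpanda(self, redpanda: str) -> None:
        self.links.pop(redpanda, None)

    def drop_iceberg(self, name: str) -> None:
        # Refuse the drop while any Redpanda catalog still links to this one.
        dependents = sorted(r for r, i in self.links.items() if i == name)
        if dependents:
            raise CatalogLinkError(
                f"cannot drop {name}: still linked from {', '.join(dependents)}"
            )
        self.iceberg.discard(name)


registry = CatalogRegistry()
registry.create_iceberg("lakehouse")
registry.link_redpanda("streams", "lakehouse")
try:
    registry.drop_iceberg("lakehouse")
except CatalogLinkError as err:
    print(f"drop rejected: {err}")
registry.detach_redpanda("streams")
registry.drop_iceberg("lakehouse")  # succeeds once the link is gone
```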

Jira ticket alignment

Ticket: DOC-2006 — "Document feature auto union Iceberg and Redpanda topic" (extracted from branch name).

Status: The PR delivers everything the ticket implies — bridge-query how-to + the three new Iceberg catalog SQL statements + USING-CATALOG documentation on the existing Redpanda catalog page. Greketrotny's engineering review addressed the REFRESH semantics and drop-with-link constraint, both of which are now documented correctly. ✓

Critical issues (must fix)

1. Broken xrefs to two sibling-PR targets, three occurrences in total (verified missing on rp-sql):

| File:line | xref target | Provided by |
| --- | --- | --- |
| `query-iceberg-topics.adoc:13` | `sql:query-data/query-streaming-topics.adoc[]` | PR #574 (still OPEN) |
| `query-iceberg-topics.adoc:21` | `sql:get-started/deploy-sql-cluster.adoc[Enable Redpanda SQL]` | PR #571 (still OPEN) |
| `query-iceberg-topics.adoc:116` | `sql:query-data/query-streaming-topics.adoc[Query streaming topics]` (Next steps) | PR #574 (still OPEN) |

Fix: Same as #571 and #574: coordinate merge ordering so all sibling PRs land on rp-sql before rp-sql lands on main. All three xrefs will resolve once #574 and #571 have merged.

2. Missing What's New entry. This is the third PR in the SQL GA series with no entry in modules/get-started/pages/whats-new-cloud.adoc. As I noted in #571 and #574, a single coordinated "Redpanda SQL: General availability" entry for the whole series would cover this PR alongside the get-started, streaming-query, and OIDC/access pages.

Fix: Add the announcement entry once before any of #571 / #574 / #575 / #580 lands.
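
A check like the xref verification above can be scripted. The following is a minimal Python sketch, not the tooling used for this review. It assumes the `modules/<module>/pages/<path>` layout that the file paths in this review follow, and it handles only module-qualified xrefs within the same component. The helper names `xref_to_path` and `missing_xrefs` are hypothetical, and the set of existing files is a hand-written stand-in for a real branch listing.

```python
import re
from pathlib import PurePosixPath

# Matches "module:path/to/page.adoc", with an optional leading "xref:" and
# an optional trailing "#fragment" or "[label]" that is ignored.
XREF = re.compile(r"^(?:xref:)?(?P<module>[\w-]+):(?P<path>[^\[#\s]+\.adoc)")


def xref_to_path(xref: str) -> str:
    """Map a module-qualified xref to its expected file path in the repo."""
    match = XREF.match(xref)
    if match is None:
        raise ValueError(f"not a module-qualified xref: {xref!r}")
    return str(PurePosixPath("modules", match["module"], "pages", match["path"]))


def missing_xrefs(xrefs: list[str], existing_files: set[str]) -> list[str]:
    """Return the xrefs whose target file is not in existing_files."""
    return [x for x in xrefs if xref_to_path(x) not in existing_files]


# Stand-in for the files present on the branch (illustrative only).
existing = {
    "modules/sql/pages/query-data/query-iceberg-topics.adoc",
    "modules/manage/pages/iceberg/rest-catalog/index.adoc",
}
xrefs = [
    "sql:query-data/query-streaming-topics.adoc[]",
    "sql:get-started/deploy-sql-cluster.adoc[Enable Redpanda SQL]",
    "xref:manage:iceberg/rest-catalog/index.adoc",
]
unresolved = missing_xrefs(xrefs, existing)
print(unresolved)
```

In a real checkout, `existing` could be built from the `.adoc` files under `modules/` instead of being listed by hand.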

Suggestions (should consider)

1. Filename collision with the existing manage:iceberg/query-iceberg-topics.adoc. Two pages now share the basename query-iceberg-topics.adoc:
- modules/sql/pages/query-data/query-iceberg-topics.adoc (new, this PR) — nav label: "Query Iceberg-enabled Topics"
- modules/manage/pages/iceberg/query-iceberg-topics.adoc (pre-existing) — nav label: "Query Iceberg Topics"
They live in different modules and the display labels differ, so navigation isn't actually ambiguous to readers. But it's the kind of overlap that bites later (typo'd xref lands on the wrong page; future grep across query-iceberg-topics returns both). Worth confirming this is intentional and that the two scopes (Redpanda SQL vs. external query engines) are clearly distinct in each page's intro.

2. `// TODO` placeholder in query-iceberg-topics.adoc:102:
```
// TODO: Verify with engineering whether there are workload patterns that
// reliably trigger longer planning, and document them if so (qa-questions.md #22).
```
- Suggested: Open a follow-up ticket and reference it here (e.g., // TODO(DOC-XXXX): ...) so the TODO has an issue trail rather than a freeform pointer to a private QA document.
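
The basename overlap flagged in the first suggestion is easy to detect mechanically. Here is a minimal Python sketch, assuming page paths are available as plain strings. The function name `basename_collisions` is hypothetical, and the third path in the example is included only as a non-colliding contrast.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def basename_collisions(paths: list[str]) -> dict[str, list[str]]:
    """Group paths by file name and keep only names used by more than one path."""
    by_name: dict[str, list[str]] = defaultdict(list)
    for path in paths:
        by_name[PurePosixPath(path).name].append(path)
    return {name: sorted(group) for name, group in by_name.items() if len(group) > 1}


pages = [
    "modules/sql/pages/query-data/query-iceberg-topics.adoc",
    "modules/manage/pages/iceberg/query-iceberg-topics.adoc",
    "modules/reference/pages/sql/sql-statements/create-iceberg-catalog.adoc",
]
collisions = basename_collisions(pages)
print(collisions)
```

Run against every `.adoc` file under `modules/`, a check like this would surface any future collisions as well, which is where the "bites later" risk comes from.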

Impact on other files

modules/ROOT/nav.adoc ✓ — all four new pages added at lines 357 (how-to), 541 (alter), 545 (create), 549 (drop).
modules/get-started/pages/whats-new-cloud.adoc ❌ — no SQL GA entry (Critical #2).
Cross-page consistency: the new how-to refers to bridge queries planning a union internally; the create-redpanda-catalog reference and create-iceberg-catalog reference describe the same model from their respective directions; create-redpanda-catalog's new "linked catalog" example is consistent with the auto-provisioned ALTER REDPANDA CATALOG ... USING CATALOG shown in the how-to. No drift detected. ✓
Cross-component xrefs verified inside this PR's content:
- xref:reference:properties/cluster-properties.adoc#iceberg_catalog_type ✓
- xref:manage:iceberg/rest-catalog/index.adoc ✓
- xref:manage:iceberg/about-iceberg-topics.adoc ✓
- xref:sql:connect-to-sql/index.adoc ✓
- xref:reference:sql/sql-statements/create-storage.adoc ✓ (already on rp-sql)
- All intra-PR xrefs (create/alter/drop/redpanda-catalog statements) ✓
- Only the two sibling-PR xrefs above are unresolved.
Pre-existing manage:iceberg/query-iceberg-topics.adoc is unchanged by this PR and still in nav at line 433 — no breakage there.

CodeRabbit findings worth considering

None. CodeRabbit's check passed with no actionable findings.

Outstanding review activity (not findings — just status)

Greketrotny's APPROVED (May 14 and May 15, two approvals on file).
His follow-up COMMENTED reviews (May 18, May 20) raised three specific points:
- "REFRESH pertains to the schema/shape of the table only" → current diff at lines 93–95 reflects this precisely.
- "User cannot drop an iceberg catalog when there is another Kafka catalog linking" → documented at drop-iceberg-catalog.adoc:7–8.
- "Don't document [this behavior]" on create-redpanda-catalog → Kat replied "Removed", and the current diff is clean.
Greketrotny's final comment on the how-to was "looks accurate". No outstanding engineer-blocking concerns.

What works well

Engineering-validated content. Greketrotny iterated on the technical specifics and ended with "looks accurate". The REFRESH guidance, the drop-with-link constraint, and the catalog-provisioning chain were all verified by the implementer.
"How the query catalogs are set up" section is genuinely useful — it shows the three SQL statements Cloud runs under the hood so administrators can reason about the setup without being asked to run them, and so debugging-via-DESCRIBE makes sense.
"Handle schema differences" section documents a real edge case (Iceberg table holding columns the topic schema no longer has, and the resulting planning-time error) with the exact resolution.
Comprehensive options tables in the new reference pages — covering all four auth types in CREATE ICEBERG CATALOG plus TLS settings, plus AWS-default-credential-chain behavior for SigV4.
Examples per reference page demonstrate every distinct usage shape — none of the three example pages is a single-snippet page.
Related statements cross-link table at the bottom of create-iceberg-catalog.adoc makes lateral navigation between catalog statements easy.
Frontmatter compliance: description, :page-topic-type: (how-to / reference), learning objectives observable and measurable, personas (app_developer, data_engineer) correctly scoped to the query-side audience.
No em dashes anywhere in the new content (clean compared to #574).
All H2+ headings in sentence case; H1s in title case (or all-caps SQL keywords). ✓
All code blocks have explicit [source,sql] language tags. Consistent with the SQL module's convention.
CI fully green and the Netlify preview links cover the four new pages.

Final-pass review via /docs-team-standards:pr-review.

Feediver1

@kbatuigas Other than the dependencies on other PRs being merged and the What's New update, this looks clean. I'm going to approve with the understanding that the other PRs will be merged and the What's New entry will be added.

kbatuigas force-pushed the DOC-2006-document-feature-auto-union-iceberg-and-redpanda-topic branch 2 times, most recently from 03d3421 to 3fd1c5a Compare May 11, 2026 23:06

kbatuigas force-pushed the rp-sql branch from 248d62d to 3c582b2 Compare May 13, 2026 18:10

kbatuigas force-pushed the DOC-2006-document-feature-auto-union-iceberg-and-redpanda-topic branch from 44f89d0 to 72c11f6 Compare May 13, 2026 18:49

kbatuigas commented May 13, 2026

kbatuigas requested a review from Greketrotny May 13, 2026 18:53

kbatuigas marked this pull request as ready for review May 13, 2026 18:53

kbatuigas requested a review from a team as a code owner May 13, 2026 18:53

kbatuigas requested a review from mattschumpert May 13, 2026 18:54

Greketrotny approved these changes May 14, 2026

Comment thread modules/reference/pages/sql/sql-statements/drop-iceberg-catalog.adoc

Greketrotny approved these changes May 15, 2026

Greketrotny reviewed May 18, 2026

Comment thread modules/reference/pages/sql/sql-statements/create-redpanda-catalog.adoc Outdated

Greketrotny reviewed May 18, 2026

kbatuigas force-pushed the rp-sql branch from e051360 to 1b9d587 Compare May 19, 2026 03:26

kbatuigas force-pushed the DOC-2006-document-feature-auto-union-iceberg-and-redpanda-topic branch from 95b585a to 6999a6c Compare May 19, 2026 03:30

kbatuigas commented May 19, 2026

kbatuigas mentioned this pull request May 20, 2026

SQL: bytea support #585

Merged

4 tasks

This was referenced May 21, 2026

SQL GA - Get started #571

Merged

SQL: Query topics #574

Merged

Feediver1 reviewed May 21, 2026

Feediver1 approved these changes May 21, 2026

This was referenced May 21, 2026

SQL: OIDC and access management #580

Merged

SQL: OOM #584

Merged

kbatuigas force-pushed the DOC-2006-document-feature-auto-union-iceberg-and-redpanda-topic branch from 4444b5c to 1a233e6 Compare May 22, 2026 23:44

kbatuigas added 3 commits May 22, 2026 18:13

Start reference update for SQL / Iceberg catalog

254d276

Draft bridge queries doc

65cc88d

Update query-iceberg-topics per SME feedback

8fc0797

kbatuigas added 12 commits May 22, 2026 18:13

Add new iceberg catalog statements

1c2ad12

Clarify Iceberg benefit of querying data aged out of topic retention

3480df3

Links to Iceberg catalog docs

9effb0b

Update example names

4ae0bc3

Update based on new-features list

d6628fe

Sync options from source

7ab5d64

Apply suggestions from SME feedback

50a9a84

Apply suggestions

7356b78

Note about default schema

441f215

Update page title

0ca9c8a

Reframe per SME input

61c6d1a

Remove TODO

82d7ea9

kbatuigas force-pushed the DOC-2006-document-feature-auto-union-iceberg-and-redpanda-topic branch from 1a233e6 to 82d7ea9 Compare May 23, 2026 01:14

kbatuigas merged commit 2979670 into rp-sql May 23, 2026
5 checks passed

kbatuigas deleted the DOC-2006-document-feature-auto-union-iceberg-and-redpanda-topic branch May 23, 2026 01:17

